Towards Detailed Recognition of Visual Categories
Abstract
As humans, we have a remarkable ability to perceive the world around us in minute detail purely from the light that is reflected off it: we can estimate material and metric properties of objects, localize people in images, describe what they are doing, and even identify them. Automatic methods for such detailed recognition of images are essential for most human-centric applications and for large-scale analysis of the content of media collections for market research, advertisement, and social studies. For example, to support shopping for shoes in an online catalogue, a system should be able to understand the style of a shoe, the length of its heels, or the shininess of its material. To support visual demographics analysis for advertisement, a system should be able not only to identify the people in a scene but also to understand what kind (style and brand) of clothes they are wearing, whether they are wearing any accessories, and so on. Despite several successes, such detailed recognition remains beyond current computer vision systems. This is a challenging task, and progress requires advances on several fronts. We need better representations of visual categories that enable fine-grained reasoning about their properties, as well as machine learning methods that can leverage 'big data' to learn such representations. To enable benchmarks for evaluating recognition tasks, and to guide learning and inference in models that solve challenging problems, we need to develop better ways of human-computer interaction. My research touches upon several such themes at the intersection of computer vision, machine learning, and human-computer interaction including:
Similar resources
Using Eye Movement Analysis to Study Auditory Effects on Visual Memory Recall
Recent studies in affective computing are focused on sensing human cognitive context using biosignals. In this study, electrooculography (EOG) was utilized to investigate memory recall accessibility via eye movement patterns. Twelve subjects participated in our experiment, wherein pictures from four categories were presented. Each category contained nine pictures of which three were presented t...
A Computational Approach towards Visual Object Recognition at Taxonomic Levels of Concepts
It has been argued that concepts can be perceived at three main levels of abstraction. Generally, in a recognition system, object categories can be viewed at three levels of taxonomic hierarchy which are known as superordinate, basic, and subordinate levels. For instance, "horse" is a member of subordinate level which belongs to basic level of "animal" and superordinate level of "natural object...
Human Action Recognition and Shape Segmentation-Recognition
Human Action Recognition. Human action recognition has a broad range of applications such as video search, sports analysis, human-robot interaction, and health care. Our work is organized in two directions: 1) detailed pixel-level ‘motion and pose’, focusing on close interactions among people; 2) action recognition focusing on goal-oriented motion, simplified as ‘action = motion + intention’....
How Humans Describe Short Videos
Recognition, manipulation and representation of visual objects can be simplified significantly by “abstraction”. By definition abstraction extracts essential features and properties while it neglects unnecessary details. We have conducted two sets of experiments in order to relate abstraction levels used by humans when describing videos, to abstraction level categories used in computer vision. ...
Active Object Exploration in Toddlers and its Role in Visual Object Recognition
Adult active object exploration of novel objects is highly focused; particularly considering the time spent performing specific visual transformations of an object as found in previous experimental studies. The most stereotypical is rotating around object orientations where the object’s main axis is either elongated (e.g. a side) or foreshortened (e.g. the top or the bottom). These orientations...
Publication date: 2013